Conversation
…into issue1564
for more information, see https://pre-commit.ci
geetu040 left a comment
Please sync with the base PR and update with these comments: #1576 (comment)
Pull request overview
Copilot reviewed 52 out of 53 changed files in this pull request and generated 2 comments.
```diff
-configurable_fields = [f for f in config._defaults if f not in ["max_retries"]]
+configurable_fields = [
+    f.name for f in fields(openml._config.OpenMLConfig) if f.name not in ["max_retries"]
```
The code filters out a field named 'max_retries', but the OpenMLConfig dataclass does not contain a field with this name. The actual field is connection_n_retries. This filter condition should be updated to match the actual field name or removed if no longer needed.
```suggestion
f.name
for f in fields(openml._config.OpenMLConfig)
if f.name not in ["connection_n_retries"]
```
```diff
 # Example script which will appear in the upcoming OpenML-Python paper
 # This test ensures that the example will keep running!
-with overwrite_config_context(
+with openml.config.overwrite_config_context(  # noqa: F823
```
The # noqa: F823 comment appears incorrect for this context. F823 is a Flake8 error code for "local variable referenced before assignment", which doesn't apply here. This line should likely use # noqa: F401 (unused import) if anything, but since openml is used on line 12, it's likely not needed at all and should be removed.
…into runs-migration-stacked
Pull request overview
Copilot reviewed 56 out of 57 changed files in this pull request and generated 10 comments.
```python
def test_switch_to_example_configuration(self):
    """Verifies the test configuration is loaded properly."""
    # Below is the default test key which would be used anyway, but just for clarity:
    openml.config.apikey = "any-api-key"
    openml.config.server = self.production_server
    openml.config.set_servers("production")

    openml.config.start_using_configuration_for_example()

    assert openml.config.apikey == TestBase.user_key
    assert openml.config.server == self.test_server
    openml.config.servers = openml.config.get_servers("test")
```
This test doesn't verify that start_using_configuration_for_example() actually switched to the test server configuration. It only sets openml.config.servers to test servers after the switch, which doesn't test the intended behavior. Add assertions to verify that the configuration was correctly switched to test servers.
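A minimal, self-contained sketch of the assertion pattern this comment asks for. `FakeConfig`, its server URLs, and the method names below are stand-ins for `openml.config`, not the real implementation:

```python
# Hypothetical stand-in for openml.config, just to show the shape of the check:
# after switching, *assert* that the live servers mapping equals the "test"
# mapping instead of overwriting it with an assignment.
class FakeConfig:
    def __init__(self):
        self._maps = {"production": {"v1": "prod-url"}, "test": {"v1": "test-url"}}
        self.servers = self._maps["production"]

    def get_servers(self, name):
        return self._maps[name]

    def start_using_configuration_for_example(self):
        self.servers = self._maps["test"]

config = FakeConfig()
config.start_using_configuration_for_example()
# The assertion the review asks for: verify the switch happened.
assert config.servers == config.get_servers("test")
```

The key difference from the diff above is that the last line is an `assert`, so the test fails if the switch did not happen.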
```python
def test_switch_from_example_configuration(self):
    """Verifies the previous configuration is loaded after stopping."""
    # Below is the default test key which would be used anyway, but just for clarity:
    openml.config.apikey = TestBase.user_key
    openml.config.server = self.production_server
    openml.config.set_servers("production")

    openml.config.start_using_configuration_for_example()
    openml.config.stop_using_configuration_for_example()

    assert openml.config.apikey == TestBase.user_key
    assert openml.config.server == self.production_server
    openml.config.servers = openml.config.get_servers("production")
```
This test doesn't verify that stop_using_configuration_for_example() actually restored the production server configuration. It only sets openml.config.servers to production servers after the stop, which doesn't test the intended behavior. Add assertions to verify that the configuration was correctly restored.
tests/test_api/test_run.py (Outdated)
```python
assert "flow_id" in runs_df.columns


def test_run_v1_publish_mocked(run_v1, use_api_v1, test_api_key):
```
The fixture name test_api_key is not defined in conftest.py. The available fixture is named test_apikey_v1. Either rename this parameter to test_apikey_v1 or create a fixture named test_api_key.
```diff
 def _mocked_perform_api_call(call, request_method):
-    url = openml.config.server + "/" + call
+    url = openml.config.server  + call
```
There's an extra space before the + operator on this line. While this doesn't affect functionality, it's inconsistent with the surrounding code style. Consider removing the extra space.
tests/test_api/test_run.py (Outdated)
```python
assert "flow_id" in runs_df.columns


def test_run_v1_publish_mocked(run_v1, use_api_v1, test_api_key):
```
The fixture name use_api_v1 is not defined anywhere in the codebase. This fixture is referenced but never created, which will cause the test to fail.
tests/test_api/test_run.py (Outdated)
```python
)


def test_run_v1_delete_mocked(run_v1, use_api_v1, test_api_key):
```
The fixture name test_api_key is not defined in conftest.py. The available fixture is named test_apikey_v1. Either rename this parameter to test_apikey_v1 or create a fixture named test_api_key.
tests/test_api/test_run.py (Outdated)
```python
)


def test_run_v1_delete_mocked(run_v1, use_api_v1, test_api_key):
```
The fixture name use_api_v1 is not defined anywhere in the codebase. This fixture is referenced but never created, which will cause the test to fail.
```python
"openml_logger",
"_examples",
"OPENML_CACHE_DIR_ENV_VAR",
"OPENML_SKIP_PARQUET_ENV_VAR",
```
The attribute `OPENML_TEST_SERVER_ADMIN_KEY_ENV_VAR` is set in `__init__` but is not included in the whitelist in `__setattr__`. This could cause issues if code tries to set this attribute after initialization. Add it to the whitelist on lines 166-177.
```suggestion
"OPENML_SKIP_PARQUET_ENV_VAR",
"OPENML_TEST_SERVER_ADMIN_KEY_ENV_VAR",
```
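For context, a generic sketch of the whitelist-`__setattr__` pattern this comment refers to: every attribute assigned in `__init__` must also appear in the whitelist, or later assignment raises. Class and attribute names here are illustrative, not the actual openml ones:

```python
class StrictNamespace:
    # Whitelist of attributes that may be (re)assigned after construction.
    _ALLOWED = frozenset({"cachedir", "_defaults"})

    def __init__(self):
        # Bypass the check during construction via object.__setattr__.
        object.__setattr__(self, "cachedir", "/tmp/cache")
        object.__setattr__(self, "_defaults", {})

    def __setattr__(self, name, value):
        if name not in self._ALLOWED:
            raise AttributeError(f"cannot set unknown attribute {name!r}")
        object.__setattr__(self, name, value)

ns = StrictNamespace()
ns.cachedir = "/var/cache"  # whitelisted: succeeds
try:
    ns.missing = 1          # not whitelisted: rejected
except AttributeError:
    pass
```

An attribute set only in `__init__` but missing from the whitelist is exactly the failure mode described: it exists, but any later assignment raises `AttributeError`.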
```python
"_examples",
"OPENML_CACHE_DIR_ENV_VAR",
"OPENML_SKIP_PARQUET_ENV_VAR",
"_HEADERS",
```
The attribute `_defaults` is set in `__init__` but is not included in the whitelist in `__setattr__`. This could cause issues if code tries to set this attribute after initialization. Add it to the whitelist on lines 166-177.
```suggestion
"_HEADERS",
"_defaults",
```
```python
self._config = replace(
    self._config,
    servers=config["servers"],
    api_version=config["api_version"],
    fallback_api_version=config["fallback_api_version"],
    show_progress=config["show_progress"],
    avoid_duplicate_runs=config["avoid_duplicate_runs"],
    retry_policy=config["retry_policy"],
    connection_n_retries=int(config["connection_n_retries"]),
)
```
The _setup method expects the config dictionary to contain fields like "servers", "api_version", and "fallback_api_version", but _parse_config returns the raw config parser output which may not contain these fields unless they are explicitly set in the config file. This will cause a KeyError when trying to access these fields. Either add default handling for missing fields or ensure _parse_config returns all required fields with defaults.
Pull request overview
Copilot reviewed 56 out of 57 changed files in this pull request and generated 6 comments.
```python
self._config = replace(
    self._config,
    servers=config["servers"],
    api_version=config["api_version"],
    fallback_api_version=config["fallback_api_version"],
```
OpenMLConfigManager._setup() assumes values returned by _parse_config() are already typed (e.g. servers as a dict and api_version as APIVersion). However, configparser.RawConfigParser(...).items() returns strings, so servers, api_version, and fallback_api_version will be stringified defaults when the config file is missing/legacy, breaking later indexing like self.servers[self.api_version]. Consider keeping these fields out of the config file (or explicitly parsing them) and only updating them when they’re provided as proper objects (e.g. from in-memory overrides/tests).
```suggestion
# Determine servers configuration, ensuring it remains a properly typed dict.
servers = self._config.servers
if "servers" in config and isinstance(config["servers"], dict):
    servers = config["servers"]
# Determine API version, accepting either an APIVersion instance or a convertible string.
api_version = self._config.api_version
if "api_version" in config:
    raw_api_version = config["api_version"]
    if isinstance(raw_api_version, APIVersion):
        api_version = raw_api_version
    elif isinstance(raw_api_version, str):
        try:
            api_version = APIVersion(raw_api_version)
        except ValueError:
            self.openml_logger.warning(
                "Invalid api_version '%s' in configuration; using default '%s'.",
                raw_api_version,
                self._config.api_version.value,
            )
# Determine fallback API version, allowing None, APIVersion, or a convertible string.
fallback_api_version = self._config.fallback_api_version
if "fallback_api_version" in config:
    raw_fallback = config["fallback_api_version"]
    if raw_fallback is None or raw_fallback == "":
        fallback_api_version = None
    elif isinstance(raw_fallback, APIVersion):
        fallback_api_version = raw_fallback
    elif isinstance(raw_fallback, str):
        try:
            fallback_api_version = APIVersion(raw_fallback)
        except ValueError:
            self.openml_logger.warning(
                "Invalid fallback_api_version '%s' in configuration; using default '%s'.",
                raw_fallback,
                (
                    self._config.fallback_api_version.value
                    if self._config.fallback_api_version is not None
                    else None
                ),
            )
self._config = replace(
    self._config,
    servers=servers,
    api_version=api_version,
    fallback_api_version=fallback_api_version,
```
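A quick self-contained check of the `configparser` behavior the comment describes: everything read back from a config file is a string, so typed defaults written into the file come back stringified and must be re-parsed before use. Section and option names here are made up:

```python
import configparser

parser = configparser.RawConfigParser()
parser.read_string(
    """
[FAKE_SECTION]
connection_n_retries = 5
show_progress = False
"""
)
values = dict(parser.items("FAKE_SECTION"))

# Every value is a str, regardless of the type it was written with.
assert all(isinstance(v, str) for v in values.values())
assert values["connection_n_retries"] == "5"   # not int 5
assert values["show_progress"] == "False"      # a truthy string, not bool False
```

A stringified `servers` dict or `api_version` enum would fail the same way, which is why those fields need explicit parsing (or should stay out of the file entirely).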
```python
    OpenMLHashException
        If checksum verification fails.
    """
    url = urljoin(self.server, path)
```
urljoin(self.server, path) will drop the last path segment if self.server does not end with / (e.g. base .../api/v1/xml + task/1 becomes .../api/v1/task/1). Since openml.config.server can be set by users without a trailing slash (and legacy configs likely do), normalize the base URL (ensure trailing slash) before calling urljoin, or avoid urljoin for simple concatenation.
```suggestion
url = f"{self.server.rstrip('/')}/{path.lstrip('/')}"
```
```python
if TYPE_CHECKING:
    from ._config import OpenMLConfigManager


config: OpenMLConfigManager = _config_module.__config
```
openml.config is now an attribute (config manager instance), but the openml/config.py module was removed. This breaks existing user code that does import openml.config or patches via that module path. Consider restoring a lightweight openml/config.py shim (re-exporting from openml._config) or registering an alias in sys.modules to preserve the import path for backward compatibility.
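A sketch of the `sys.modules` alias idea, using placeholder names (`mypkg`, `server`) rather than the actual openml layout:

```python
import sys
import types

# Build a stand-in package and a config submodule entirely in memory.
pkg = types.ModuleType("mypkg")
cfg = types.ModuleType("mypkg.config")
cfg.server = "https://example.org/api"  # example attribute on the shim

# Registering both names in sys.modules (and linking the attribute) preserves
# the old import path: the import machinery checks sys.modules before finders.
pkg.config = cfg
sys.modules["mypkg"] = pkg
sys.modules["mypkg.config"] = cfg

import mypkg.config  # resolves to the shim registered above
assert mypkg.config.server == "https://example.org/api"
```

In the real package, `sys.modules["openml.config"]` could be pointed at the config-manager-backed module so both `import openml.config` and `mock.patch("openml.config...")` keep working.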
```python
Parameters
----------
config : Config
    Configuration object containing API versions, endpoints, cache
    settings, and connection parameters.
```
The build() docstring documents a config : Config parameter, but the method signature is (api_version, fallback_api_version) and there is no config argument. Update the docstring to match the actual parameters to avoid misleading API documentation.
```python
def sample_download_url_v1(test_server_v1) -> str:
    server = test_server_v1.split("api/")[0]
    endpoint = "data/v1/download/1/anneal.arff"
    url = server + endpoint
    return url
```
sample_download_url_v1 builds a download URL using data/v1/download/..., but the rest of the codebase (e.g. openml._api_calls._file_id_to_url) uses the unversioned /data/download/<id> pattern. If the test server doesn’t expose data/v1/download, these tests will 404. Consider generating the URL via the same helper used in production code (or align the hardcoded path with the actual download endpoint).
```python
@pytest.fixture
def dummy_task_v2(http_client_v2, minio_client) -> DummyTaskV1API:
    return DummyTaskV2API(http=http_client_v2, minio=minio_client)


@pytest.fixture
def dummy_task_fallback(dummy_task_v1, dummy_task_v2) -> DummyTaskV1API:
    return FallbackProxy(dummy_task_v2, dummy_task_v1)
```
The fixture return type annotations don’t match what’s returned: dummy_task_v2 returns DummyTaskV2API (not DummyTaskV1API), and dummy_task_fallback returns a FallbackProxy. Even if mypy ignores tests, keeping the annotations accurate helps IDEs/readability.
fixes #1624